45 research outputs found

    Mining Discourse Treebanks with XQuery

    Get PDF
    Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 245-256. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891

    Semantics-based Question Generation and Implementation

    Get PDF
    This paper presents a question generation system based on the approach of semantic rewriting. The state-of-the-art deep linguistic parsing and generation tools are employed to convert (back and forth) between the natural language sentences and their meaning representations in the form of Minimal Recursion Semantics (MRS). By carefully operating on the semantic structures, we show a principled way of generating questions without ad-hoc manipulation of the syntactic structures. Based on the (partial) understanding of the sentence meaning, the system generates questions which are semantically grounded and purposeful. And with the support of deep linguistic grammars, the grammaticality of the generation results is warranted. Further, with a specialized ranking model, the linguistic realizations from the general purpose generation model are further refined for our the question generation task. The evaluation results from QGSTEC2010 show promising prospects of the proposed approach

    Dual-mode adaptive-SVD ghost imaging

    Full text link
    In this paper, we present a dual-mode adaptive singular value decomposition ghost imaging (A-SVD GI), which can be easily switched between the modes of imaging and edge detection. It can adaptively localize the foreground pixels via a threshold selection method. Then only the foreground region is illuminated by the singular value decomposition (SVD) - based patterns, consequently retrieving high-quality images with fewer sampling ratios. By changing the selecting range of foreground pixels, the A-SVD GI can be switched to the mode of edge detection to directly reveal the edge of objects, without needing the original image. We investigate the performance of these two modes through both numerical simulations and experiments. We also develop a single-round scheme to halve measurement numbers in experiments, instead of separately illuminating positive and negative patterns in traditional methods. The binarized SVD patterns, generated by the spatial dithering method, are modulated by a digital micromirror device (DMD) to speed up the data acquisition. This dual-mode A-SVD GI can be applied in various applications, such as remote sensing or target recognition, and could be further extended for multi-modality functional imaging/detection

    Mining Implicit Relevance Feedback from User Behavior for Web Question Answering

    Full text link
    Training and refreshing a web-scale Question Answering (QA) system for a multi-lingual commercial search engine often requires a huge amount of training examples. One principled idea is to mine implicit relevance feedback from user behavior recorded in search engine logs. All previous works on mining implicit relevance feedback target at relevance of web documents rather than passages. Due to several unique characteristics of QA tasks, the existing user behavior models for web documents cannot be applied to infer passage relevance. In this paper, we make the first study to explore the correlation between user behavior and passage relevance, and propose a novel approach for mining training data for Web QA. We conduct extensive experiments on four test datasets and the results show our approach significantly improves the accuracy of passage ranking without extra human labeled data. In practice, this work has proved effective to substantially reduce the human labeling cost for the QA service in a global commercial search engine, especially for languages with low resources. Our techniques have been deployed in multi-language services.Comment: Accepted by KDD 202

    Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture

    Full text link
    We describe a new deep learning architecture for learning to rank question answer pairs. Our approach extends the long short-term memory (LSTM) network with holographic composition to model the relationship between question and answer representations. As opposed to the neural tensor layer that has been adopted recently, the holographic composition provides the benefits of scalable and rich representational learning approach without incurring huge parameter costs. Overall, we present Holographic Dual LSTM (HD-LSTM), a unified architecture for both deep sentence modeling and semantic matching. Essentially, our model is trained end-to-end whereby the parameters of the LSTM are optimized in a way that best explains the correlation between question and answer representations. In addition, our proposed deep learning architecture requires no extensive feature engineering. Via extensive experiments, we show that HD-LSTM outperforms many other neural architectures on two popular benchmark QA datasets. Empirical studies confirm the effectiveness of holographic composition over the neural tensor layer.Comment: SIGIR 2017 Full Pape

    Quantitative and dark field ghost imaging with ultraviolet light

    Full text link
    Ultraviolet (UV) imaging enables a diverse array of applications, such as material composition analysis, biological fluorescence imaging, and detecting defects in semiconductor manufacturing. However, scientific-grade UV cameras with high quantum efficiency are expensive and include a complex thermoelectric cooling system. Here, we demonstrate a UV computational ghost imaging (UV-CGI) method to provide a cost-effective UV imaging and detection strategy. By applying spatial-temporal illumination patterns and using a 325 nm laser source, a single-pixel detector is enough to reconstruct the images of objects. To demonstrate its capability for quantitative detection, we use UV-CGI to distinguish four UV-sensitive sunscreen areas with different densities on a sample. Furthermore, we demonstrate dark field UV-CGI in both transmission and reflection schemes. By only collecting the scattered light from objects, we can detect the edges of pure phase objects and small scratches on a compact disc. Our results showcase a feasible low-cost solution for non-destructive UV imaging and detection. By combining it with other imaging techniques, such as hyperspectral imaging or time-resolved imaging, a compact and versatile UV computational imaging platform may be realized for future applications.Comment: 9 pages, 5 figure

    Question Generation with Minimal Recursion Semantics

    No full text
    Question Generation (QG) is the task of generating reasonable questions from a text. It is a relatively new research topic and has its potential usage in intelligent tutoring systems and closed-domain question answering systems. Current approaches include template or syntax based methods. This thesis proposes a novel approach based entirely on semantics. Minimal Recursion Semantics (MRS) is a meta-level semantic representation with emphasis on scope underspecification. With the English Resource Grammar and various tools from the DELPH-IN community, a natural language sentence can be interpreted as an MRS structure by parsing, and an MRS structure can be realized as a natural language sentence through generation. There are three issues emerging from semantics-based QG: (1) sentence simplification for complex sentences, (2) question transformation for declarative sentences, and (3) generation ranking. Three solutions are also proposed: (1) MRS decomposition through a Connected Dependency MRS Graph, (2) MRS transformation from declarative sentences to interrogative sentences, and (3) question ranking by simple language models atop a MaxEnt-based model. The evaluation is conducted in the context of the Question Generation Shared Task and Generation Challenge 2010. The performance of proposed method is compared against other syntax and rule based systems. The result also reveals the challenges of current research on question generation and indicates direction for future work.

    FEATURE-DRIVEN QUESTION ANSWERING WITH NATURAL LANGUAGE ALIGNMENT

    Get PDF
    Question Answering (QA) is the task of automatically generating answers to natural language questions from humans, serving as one of the primary research areas in natural language human-computer interaction. This dissertation focuses on English fact-seeking (factoid) QA, for instance: when was Johns Hopkins founded? (January 22, 1876). The key challenge in QA is the generation and recognition of indicative signals for answer patterns. In this dissertation I propose the idea of feature-driven QA, a machine learning framework that automatically produces rich features from linguistic annotations of answer fragments and encodes them in compact log-linear models. These features are further enhanced by tightly coupling the question and answer snippets via monolingual alignment. In this work monolingual alignment helps question answering in two aspects: aligning semantically similar words in QA sentence pairs (with the ability to recognize paraphrases and entailment) and aligning natural language words with knowledge base relations (via web-scale data mining). With the help of modern search engines, database and machine learning tools, the proposed method is able to efficiently search through billions of facts in the web space and optimize from millions of linguistic signals in the feature space. QA is often modeled as a pipeline of the form: question (input) -> information retrieval (“search”) -> answer extraction (from either text or knowledge base) -> answer (output). This dissertation demonstrates the feature-driven approach applied throughout the QA pipeline: the search front end with structured information retrieval, the answer extraction back end from both unstructured data source (free text) and structured data source (knowledge base). Error propagation in natural language processing (NLP) pipelines is contained and minimized. The final system achieves state-of-the-art performance in several NLP tasks, including answer sentence ranking and answer extraction on one QA dataset, monolingual alignment on two annotated datasets, and question answering from Freebase with web queries. This dissertation shows the capability of a feature-driven framework serving as the statistical backbone of modern question answering systems
    corecore